Controlling deliberation in a Markov decision process-based agent

Authors

George Alexander

Anita Raja

David J. Musliner

Abstract

Meta-level control manages the allocation of limited resources to deliberative actions. This paper discusses efforts in adding meta-level control capabilities to a Markov Decision Process (MDP)-based scheduling agent. The agent’s reasoning process involves continuous partial unrolling of the MDP state space and periodic reprioritization of the states to be expanded. The meta-level controller makes situation-specific decisions on when the agent should stop unrolling in order to derive a partial policy while bounding the costs of state reprioritization. The described approach uses performance profiling combined with multi-level strategies in its decision making. We present results showing the performance advantage of dynamic meta-level control for this complex agent.
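The control loop outlined in the abstract (unroll part of the state space, reprioritize the frontier periodically, and let a meta-level rule decide when to stop) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy transition model `successors`, the heuristic `priority`, the performance profile `profile`, and all cost parameters are hypothetical names and values chosen only for illustration. The stopping rule shown here, comparing the profiled marginal gain with the marginal deliberation cost, is one simple assumption about how such a controller might decide.

```python
import heapq
import math


def successors(state):
    """Hypothetical toy transition model: every integer state has two children."""
    return [2 * state + 1, 2 * state + 2]


def priority(state, bias):
    """Hypothetical expansion heuristic: a smaller value is expanded sooner."""
    return abs(state - bias)


def profile(n):
    """Hypothetical performance profile: expected policy quality after n expansions."""
    return 1.0 - math.exp(-n / 10.0)


def unroll(start=0, step_cost=0.01, sort_cost=0.02, sort_every=5, max_steps=1000):
    """Partially unroll the state space under a simple meta-level stopping rule.

    Returns the list of expanded states, in expansion order.
    """
    bias = start
    frontier = [(priority(start, bias), start)]
    expanded = []
    while frontier and len(expanded) < max_steps:
        n = len(expanded)
        # Meta-level decision: stop once the profiled gain of one more expansion
        # no longer covers its cost (including amortized reprioritization cost).
        marginal_gain = profile(n + 1) - profile(n)
        marginal_cost = step_cost + sort_cost / sort_every
        if marginal_gain < marginal_cost:
            break
        # Object-level deliberation: expand the most promising frontier state.
        _, state = heapq.heappop(frontier)
        expanded.append(state)
        for child in successors(state):
            heapq.heappush(frontier, (priority(child, bias), child))
        # Periodic reprioritization: recompute every frontier priority.
        if len(expanded) % sort_every == 0:
            bias = state  # assumed stand-in for a change in the agent's situation
            frontier = [(priority(s, bias), s) for _, s in frontier]
            heapq.heapify(frontier)
    return expanded
```

Calling `unroll()` returns the states expanded before the stopping rule fires. Raising `sort_every` lowers the amortized reprioritization cost, so the loop deliberates longer, which mirrors the trade-off the abstract describes between unrolling further and bounding reprioritization cost.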


Similar resources

Flexibly Integrating Deliberation and Execution in Decision-Theoretic Agents

We are developing software agents that plan, schedule, and coordinate complex behavior in uncertain environments by reasoning about dynamically-constructed Markov Decision Problems (MDPs). What makes the work we present here different from traditional MDP-based agent systems is that an agent in our system might lack the time and/or knowledge to build its complete MDP and corresponding optimal p...


Other Agents' Actions as Asynchronous Events

An individual planning agent does not generally have sufficient computational resources at its disposal to produce an optimal plan in a complex domain, as deliberation itself requires and consumes scarce resources. This problem is further exacerbated in a distributed planning context in which multiple, heterogeneous agents must expend a portion of their resource allotment on communication, nego...


Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used to model multi-agent systems and provide a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata-based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...


Influence-Based Autonomy Levels in Agent Decision-Making

Autonomy is a crucial and powerful feature of agents and it is the subject of much research in the agent field. Controlling the autonomy of agents is a way to coordinate the behavior of groups of agents. Our approach is to look at it as a design problem for agents. We analyze the autonomy of an agent as a gradual property that is related to the degree of intervention of other agents in the deci...


A new machine replacement policy based on number of defective items and Markov chains

A novel optimal single machine replacement policy using a single as well as a two-stage decision making process is proposed based on the quality of items produced. In a stage of this policy, if the number of defective items in a sample of produced items is more than an upper threshold, the machine is replaced. However, the machine is not replaced if the number of defective items is less than ...




Publication date: 2008
